{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MctthPQNUiMt"
      },
      "source": [
        "##### Copyright 2024 Google LLC."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "0sK9GK2mUr4Z"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EecCBd3afA7C"
      },
      "source": [
        "# Question Answering using Gemini, LlamaIndex, and Chroma"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XhAqH8SXfLhn"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/google/generative-ai-docs/blob/main/examples/gemini/python/llamaindex/Gemini_LlamaIndex_QA_Chroma_WebPageReader.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/google/generative-ai-docs/blob/main/examples/gemini/python/llamaindex/Gemini_LlamaIndex_QA_Chroma_WebPageReader.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "itTLwgnvkrfD"
      },
      "source": [
        "## Overview\n",
        "\n",
        "[Gemini](https://ai.google.dev/models/gemini) is a family of generative AI models that lets developers generate content and solve problems. These models are designed and trained to handle both text and images as input.\n",
        "\n",
        "[LlamaIndex](https://www.llamaindex.ai/) is a simple, flexible data framework that can be used by Large Language Model(LLM) applications to connect custom data sources to LLMs.\n",
        "\n",
        "[Chroma](https://docs.trychroma.com/) is an open-source embedding database focused on simplicity and developer productivity. Chroma allows users to store embeddings and their metadata, embed documents and queries, and search the embeddings quickly.\n",
        "\n",
        "In this notebook, you'll learn how to create an application that answers questions using data from a website with the help of Gemini, LlamaIndex, and Chroma."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fZQLFShAnNDA"
      },
      "source": [
        "## Setup\n",
        "\n",
        "First, you must install the packages and set the necessary environment variables.\n",
        "\n",
        "### Installation\n",
        "\n",
        "Install LlamaIndex's python library, `llama-index`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "iK3BOx4u6ssQ"
      },
      "outputs": [],
      "source": [
        "# This guide was tested with 0.10.17, but feel free to try newer versions.\n",
        "!pip install -q llama-index==0.10.17"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mzd3CN4yKGpw"
      },
      "source": [
        "Install LlamaIndex's integration package for Gemini, `llama-index-llms-gemini`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aUn_Pdp_KGyM"
      },
      "outputs": [],
      "source": [
        "!pip install -q llama-index-llms-gemini"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vWqDXJhuKG-B"
      },
      "source": [
        "Install LlamaIndex's integration package for Gemini embedding model, `llama-index-embeddings-gemini`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "IdAKIYmdKHH3"
      },
      "outputs": [],
      "source": [
        "!pip install -q llama-index-embeddings-gemini"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_9XYs842I-6G"
      },
      "source": [
        "Install LlamaIndex's web page reader, `llama-index-readers-web`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "HnFd2OWXI_M-"
      },
      "outputs": [],
      "source": [
        "!pip install -q llama-index-readers-web"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "muHuC24HoCBq"
      },
      "source": [
        "Install Chroma's python client SDK, `chromadb`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a0OB7mnI8KMw"
      },
      "outputs": [],
      "source": [
        "!pip install -q chromadb"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ofnYCgA8ojgz"
      },
      "source": [
        "### Grab an API Key\n",
        "\n",
        "To use Gemini you need an *API key*. You can create an API key with one click in [Google AI Studio](https://makersuite.google.com/).\n",
        "After creating the API key, you can either set an environment variable named `GOOGLE_API_KEY` to your API Key or pass the API key as an argument when using the `Gemini` class to access Google's `gemini-1.5-flash` and `gemini-1.5-pro` models or the `GeminiEmbedding` class to access Google's Generative AI embedding model using `LlamaIndex`.\n",
        "\n",
        "In this tutorial, you will set the variable `gemini_api_key` to configure Gemini to use your API key."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Wp3pYcnh60vt"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Gemini API Key:··········\n"
          ]
        }
      ],
      "source": [
        "# Run this cell and paste the API key in the prompt\n",
        "import os\n",
        "import getpass\n",
        "\n",
        "gemini_api_key = getpass.getpass('Gemini API Key:')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gSS20CpqhaDY"
      },
      "source": [
        "## Basic steps\n",
        "LLMs are trained offline on a large corpus of public data. Hence they cannot answer questions based on custom or private data accurately without additional context.\n",
        "\n",
        "If you want to make use of LLMs to answer questions based on private data, you have to provide the relevant documents as context alongside your prompt. This approach is called Retrieval Augmented Generation (RAG).\n",
        "\n",
        "You will use this approach to create a question-answering assistant using the Gemini text model integrated through LlamaIndex. The assistant is expected to answer questions about Google's Gemini model. To make this possible you will add more context to the assistant using data from a website.\n",
        "\n",
        "In this tutorial, you'll implement the two main components in a RAG-based architecture:\n",
        "\n",
        "1. Retriever\n",
        "\n",
        "    Based on the user's query, the retriever retrieves relevant snippets that add context from the document. In this tutorial, the document is the website data.\n",
        "    The relevant snippets are passed as context to the next stage - \"Generator\".\n",
        "\n",
        "2. Generator\n",
        "\n",
        "    The relevant snippets from the website data are passed to the LLM along with the user's query to generate accurate answers.\n",
        "\n",
        "You'll learn more about these stages in the upcoming sections while implementing the application."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nPKvt5_5x6rH"
      },
      "source": [
        "## Import the required libraries"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "JXkg7O9PJfe3"
      },
      "outputs": [],
      "source": [
        "from bs4 import BeautifulSoup\n",
        "from IPython.display import Markdown, display\n",
        "from llama_index.core import Document\n",
        "from llama_index.core import Settings\n",
        "from llama_index.core import SimpleDirectoryReader\n",
        "from llama_index.core import StorageContext\n",
        "from llama_index.core import VectorStoreIndex\n",
        "from llama_index.readers.web import SimpleWebPageReader\n",
        "\n",
        "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
        "\n",
        "import chromadb\n",
        "import re"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fCU_lprVhixQ"
      },
      "source": [
        "## 1. Retriever\n",
        "\n",
        "In this stage, you will perform the following steps:\n",
        "\n",
        "1. Read and parse the website data using LlamaIndex.\n",
        "\n",
        "2. Create embeddings of the website data.\n",
        "\n",
        "    Embeddings are numerical representations (vectors) of text. Hence, text with similar meaning will have similar embedding vectors. You'll make use of Gemini's embedding model to create the embedding vectors of the website data.\n",
        "\n",
        "3. Store the embeddings in Chroma's vector store.\n",
        "    \n",
        "    Chroma is a vector database. The Chroma vector store helps in the efficient retrieval of similar vectors. Thus, for adding context to the prompt for the LLM, relevant embeddings of the text matching the user's question can be retrieved easily using Chroma.\n",
        "\n",
        "4. Create a Retriever from the Chroma vector store.\n",
        "\n",
        "    The retriever will be used to pass relevant website embeddings to the LLM along with user queries."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FergxGcKh_b_"
      },
      "source": [
        "### Read and parse the website data\n",
        "\n",
        "LlamaIndex provides a wide variety of data loaders. To read the website data as a document, you will use the `SimpleWebPageReader` from LlamaIndex.\n",
        "\n",
        "To know more about how to read and parse input data from different sources using the data loaders of LlamaIndex, read LlamaIndex's [loading data guide](https://docs.llamaindex.ai/en/stable/understanding/loading/loading.html)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "xIYUiPPNjMrr"
      },
      "outputs": [],
      "source": [
        "web_documents = SimpleWebPageReader().load_data(\n",
        "    [\"https://blog.google/technology/ai/google-gemini-ai/\"]\n",
        ")\n",
        "\n",
        "# Extract the content from the website data document\n",
        "html_content = web_documents[0].text"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TamoAP7ckyvB"
      },
      "source": [
        "You can use variety of HTML parsers to extract the required text from the html content.\n",
        "\n",
        "In this example, you'll use Python's `BeautifulSoup` library to parse the website data. After processing, the extracted text should be converted back to LlamaIndex's `Document` format."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "-90BtEGikzt1"
      },
      "outputs": [],
      "source": [
        "# Parse the data.\n",
        "soup = BeautifulSoup(html_content, 'html.parser')\n",
        "p_tags = soup.findAll('p')\n",
        "text_content = \"\"\n",
        "for each in p_tags:\n",
        "    text_content += each.text + \"\\n\"\n",
        "\n",
        "# Convert back to Document format\n",
        "documents = [Document(text=text_content)]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sq-MBiAgw1ba"
      },
      "source": [
        "### Initialize Gemini's embedding model\n",
        "\n",
        "To create the embeddings from the website data, you'll use Gemini's embedding model, **embedding-001** which supports creating text embeddings.\n",
        "\n",
        "To use this embedding model, you have to import `GeminiEmbedding` from LlamaIndex. To know more about the embedding model, read Google AI's [language documentation](https://ai.google.dev/models/gemini)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Ezv0-TIiFkxv"
      },
      "outputs": [],
      "source": [
        "from llama_index.embeddings.gemini import GeminiEmbedding\n",
        "\n",
        "gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name=\"models/embedding-001\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vJB_fdQmoq_6"
      },
      "source": [
        "### Initialize Gemini\n",
        "\n",
        "You must import `Gemini` from LlamaIndex to initialize your model.\n",
        " In this example, you will use **gemini-pro**, as it supports text summarization. To know more about the text model, read Google AI's [model documentation](https://ai.google.dev/models/gemini).\n",
        "\n",
        "You can configure the model parameters such as ***temperature*** or ***top_p***,  using the  ***generation_config*** parameter when initializing the `Gemini` LLM.  To learn more about the model parameters and their uses, read Google AI's [concepts guide](https://ai.google.dev/docs/concepts#model_parameters)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Gyq6YIh97quL"
      },
      "outputs": [],
      "source": [
        "from llama_index.llms.gemini import Gemini\n",
        "\n",
        "# To configure model parameters use the `generation_config` parameter.\n",
        "# eg. generation_config = {\"temperature\": 0.7, \"topP\": 0.8, \"topK\": 40}\n",
        "# If you only want to set a custom temperature for the model use the\n",
        "# \"temperature\" parameter directly.\n",
        "\n",
        "llm = Gemini(api_key=gemini_api_key, model_name=\"models/gemini-pro\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uNLuJ-TY4utI"
      },
      "source": [
        "### Store the data using Chroma\n",
        "\n",
        " Next, you'll store the embeddings of the website data in Chroma's vector store using LlamaIndex.\n",
        "\n",
        " First, you have to initiate a Python client in `chromadb`. Since the plan is to save the data to the disk, you will use the `PersistentClient`. You can read more about the different clients in Chroma in the [client reference guide](https://docs.trychroma.com/reference/Client).\n",
        "\n",
        "After initializing the client, you have to create a Chroma collection. You'll then initialize the `ChromaVectorStore` class in LlamaIndex using the collection created in the previous step.\n",
        "\n",
        "Next, you have to set `Settings` and create storage contexts for the vector store.\n",
        "\n",
        "`Settings` is a collection of commonly used resources that are utilized during the indexing and querying phase in a LlamaIndex pipeline. You can specify the LLM, Embedding model, etc that will be used to create the application in the `Settings`. To know more about `Settings`, read the [module guide for Settings](https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/settings.html).\n",
        "\n",
        "`StorageContext` is an abstraction offered by LlamaIndex around different types of storage. To know more about storage context, read the [storage context API guide](https://docs.llamaindex.ai/en/stable/api_reference/storage.html).\n",
        "\n",
        "The final step is to load the documents and build an index over them. LlamaIndex offers several indices that help in retrieving relevant context for a user query. Here you'll use the `VectorStoreIndex` since the website embeddings have to be stored in a vector store.\n",
        "\n",
        "To create the index you have to pass the storage context along with the documents to the `from_documents` function of `VectorStoreIndex`.\n",
        "The `VectorStoreIndex` uses the embedding model specified in the `Settings` to create embedding vectors from the documents and stores these vectors in the vector store specified in the storage context. To know more about the\n",
        "`VectorStoreIndex` you can read the [Using VectorStoreIndex guide](https://docs.llamaindex.ai/en/stable/module_guides/indexing/vector_store_index.html)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1Ohzkf-LJyHO"
      },
      "outputs": [],
      "source": [
        "# Create a client and a new collection\n",
        "client = chromadb.PersistentClient(path=\"./chroma_db\")\n",
        "chroma_collection = client.get_or_create_collection(\"quickstart\")\n",
        "\n",
        "# Create a vector store\n",
        "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
        "\n",
        "# Create a storage context\n",
        "storage_context = StorageContext.from_defaults(vector_store=vector_store)\n",
        "\n",
        "# Set Global settings\n",
        "Settings.llm = llm\n",
        "Settings.embed_model = gemini_embedding_model\n",
        "\n",
        "# Create an index from the documents and save it to the disk.\n",
        "index = VectorStoreIndex.from_documents(\n",
        "    documents, storage_context=storage_context\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ir5pUZpNu3ly"
      },
      "source": [
        "### Create a retriever using Chroma\n",
        "\n",
        "You'll now create a retriever that can retrieve data embeddings from the newly created Chroma vector store.\n",
        "\n",
        "First, initialize the `PersistentClient` with the same path you specified while creating the Chroma vector store. You'll then retrieve the collection `\"quickstart\"` you created previously from Chroma. You can use this collection to initialize the `ChromaVectorStore` in which you store the embeddings of the website data. You can then use the `from_vector_store` function of `VectorStoreIndex` to load the index."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "FlAPuVLt4mBr"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "MMLU (massive multitask language understanding) is a benchmark that uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.\n"
          ]
        }
      ],
      "source": [
        "# Load from disk\n",
        "load_client = chromadb.PersistentClient(path=\"./chroma_db\")\n",
        "\n",
        "# Fetch the collection\n",
        "chroma_collection = load_client.get_collection(\"quickstart\")\n",
        "\n",
        "# Fetch the vector store\n",
        "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
        "\n",
        "# Get the index from the vector store\n",
        "index = VectorStoreIndex.from_vector_store(\n",
        "    vector_store\n",
        ")\n",
        "\n",
        "# Check if the retriever is working by trying to fetch the relevant docs related\n",
        "# to the phrase 'MMLU' (Multimodal Machine Learning Understanding).\n",
        "# If the length is greater than zero, it means that the retriever is\n",
        "# functioning well.\n",
        "# You can ask questions about your data using a generic interface called\n",
        "# a query engine. You have to use the `as_query_engine` function of the\n",
        "# index to create a query engine and use the `query` function of query engine\n",
        "# to inquire the index.\n",
        "test_query_engine = index.as_query_engine()\n",
        "response = test_query_engine.query(\"MMLU\")\n",
        "print(response)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "10heqY7ilEsi"
      },
      "source": [
        "## 2. Generator\n",
        "\n",
        "The Generator prompts the LLM for an answer when the user asks a question. The retriever you created in the previous stage from the Chroma vector store will be used to pass relevant embeddings from the website data to the LLM to provide more context to the user's query.\n",
        "\n",
        "You'll perform the following steps in this stage:\n",
        "\n",
        "1. Create a prompt for answering any question using LlamaIndex.\n",
        "    \n",
        "2. Use a query engine to ask a question and prompt the model for an answer."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iCLTx4zSxSll"
      },
      "source": [
        "### Create prompt templates\n",
        "\n",
        "You'll use LlamaIndex's [PromptTemplate](https://docs.llamaindex.ai/en/stable/module_guides/models/prompts.html) to generate prompts to the LLM for answering questions.\n",
        "\n",
        "In the `llm_prompt`, the variable `query_str` will be replaced later by the input question, and the variable `context_str` will be replaced by the relevant text from the website retrieved from the Chroma vector store."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "V96dQi1uOzfr"
      },
      "outputs": [],
      "source": [
        "from llama_index.core import PromptTemplate\n",
        "\n",
        "template = (\n",
        "    \"\"\" You are an assistant for question-answering tasks.\n",
        "Use the following context to answer the question.\n",
        "If you don't know the answer, just say that you don't know.\n",
        "Use five sentences maximum and keep the answer concise.\\n\n",
        "Question: {query_str} \\nContext: {context_str} \\nAnswer:\"\"\"\n",
        ")\n",
        "llm_prompt = PromptTemplate(template)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-aE0YWHT7bal"
      },
      "source": [
        "### Prompt the model using Query Engine\n",
        "\n",
        "You will use the `as_query_engine` function of the `VectorStoreIndex` to create a query engine from the index using the `llm_prompt` passed as the value for the `text_qa_template` argument. You can then use the `query` function of the query engine to prompt the LLM. To know more about custom prompting in LlamaIndex, read LlamaIndex's [prompts usage pattern documentation](https://docs.llamaindex.ai/en/stable/module_guides/models/prompts/usage_pattern.html#defining-a-custom-prompt)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "klNUEBbP3xbr"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Gemini is the most capable and general model that Google has ever built. It is a multimodal AI model that can understand and generate text, images, and code. Gemini is being used to power new features in a range of Google products, including Bard, Pixel, Search, and Ads.\n"
          ]
        }
      ],
      "source": [
        "# Query data from the persisted index\n",
        "query_engine = index.as_query_engine(text_qa_template=llm_prompt)\n",
        "response = query_engine.query(\"What is Gemini?\")\n",
        "print(response)"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "Gemini_LlamaIndex_QA_Chroma_WebPageReader.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
